Overview

Dataset statistics

Number of variables: 25
Number of observations: 213775
Missing cells: 1313523
Missing cells (%): 24.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 163.9 MiB
Average record size in memory: 803.7 B
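
The missing-cell percentage can be checked from the counts above, assuming it is defined as missing cells divided by all cells (observations × variables). A minimal sketch; the function name is illustrative and not part of the profiling tool:

```python
def missing_cell_percentage(missing_cells: int, n_rows: int, n_columns: int) -> float:
    """Share of missing cells among all cells, in percent."""
    total_cells = n_rows * n_columns  # every row has one cell per variable
    return 100.0 * missing_cells / total_cells


# Counts taken from the dataset statistics in this report.
pct = missing_cell_percentage(1_313_523, 213_775, 25)
print(f"{pct:.1f}%")
```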

Variable types

Numeric: 9
Categorical: 15
Boolean: 1

Alerts

AIRPORT_ID has a high cardinality: 2465 distinct values High cardinality
LOCATION has a high cardinality: 512 distinct values High cardinality
OPID has a high cardinality: 585 distinct values High cardinality
RUNWAY has a high cardinality: 808 distinct values High cardinality
STATE has a high cardinality: 63 distinct values High cardinality
TIME has a high cardinality: 1468 distinct values High cardinality
df_index is highly correlated with INCIDENT_YEAR High correlation
DISTANCE is highly correlated with HEIGHT and 1 other field High correlation
HEIGHT is highly correlated with DISTANCE and 1 other field High correlation
INCIDENT_YEAR is highly correlated with df_index High correlation
SPEED is highly correlated with DISTANCE and 1 other field High correlation
df_index is highly correlated with INCIDENT_YEAR High correlation
DISTANCE is highly correlated with HEIGHT and 1 other field High correlation
HEIGHT is highly correlated with DISTANCE and 1 other field High correlation
INCIDENT_YEAR is highly correlated with df_index High correlation
SPEED is highly correlated with DISTANCE and 1 other field High correlation
df_index is highly correlated with INCIDENT_YEAR High correlation
DISTANCE is highly correlated with HEIGHT High correlation
HEIGHT is highly correlated with DISTANCE and 1 other field High correlation
INCIDENT_YEAR is highly correlated with df_index High correlation
SPEED is highly correlated with HEIGHT High correlation
AC_CLASS is highly correlated with ENROUTE_STATE High correlation
INDICATED_DAMAGE is highly correlated with ENROUTE_STATE High correlation
PRECIPITATION is highly correlated with ENROUTE_STATE High correlation
FAAREGION is highly correlated with STATE High correlation
PHASE_OF_FLIGHT is highly correlated with ENROUTE_STATE High correlation
TIME_OF_DAY is highly correlated with ENROUTE_STATE High correlation
SKY is highly correlated with ENROUTE_STATE High correlation
STATE is highly correlated with FAAREGION High correlation
ENROUTE_STATE is highly correlated with AC_CLASS and 6 other fields High correlation
AC_MASS is highly correlated with ENROUTE_STATE High correlation
df_index is highly correlated with ENROUTE_STATE and 1 other field High correlation
DISTANCE is highly correlated with ENROUTE_STATE and 3 other fields High correlation
ENROUTE_STATE is highly correlated with df_index and 15 other fields High correlation
FAAREGION is highly correlated with ENROUTE_STATE and 4 other fields High correlation
HEIGHT is highly correlated with DISTANCE and 3 other fields High correlation
INCIDENT_MONTH is highly correlated with ENROUTE_STATE High correlation
INCIDENT_YEAR is highly correlated with df_index and 1 other field High correlation
LATITUDE is highly correlated with ENROUTE_STATE and 4 other fields High correlation
LONGITUDE is highly correlated with ENROUTE_STATE and 4 other fields High correlation
PHASE_OF_FLIGHT is highly correlated with DISTANCE and 3 other fields High correlation
PRECIPITATION is highly correlated with SKY High correlation
SKY is highly correlated with PRECIPITATION High correlation
SPEED is highly correlated with DISTANCE and 4 other fields High correlation
STATE is highly correlated with ENROUTE_STATE and 4 other fields High correlation
TIME_OF_DAY is highly correlated with ENROUTE_STATE High correlation
WARNED is highly correlated with ENROUTE_STATE High correlation
AC_MASS is highly correlated with ENROUTE_STATE and 1 other field High correlation
INDICATED_DAMAGE is highly correlated with ENROUTE_STATE High correlation
c_score is highly correlated with ENROUTE_STATE and 4 other fields High correlation
DISTANCE has 59548 (27.9%) missing values Missing
ENROUTE_STATE has 213769 (> 99.9%) missing values Missing
HEIGHT has 87483 (40.9%) missing values Missing
LOCATION has 213190 (99.7%) missing values Missing
PHASE_OF_FLIGHT has 65371 (30.6%) missing values Missing
PRECIPITATION has 103893 (48.6%) missing values Missing
RUNWAY has 30089 (14.1%) missing values Missing
SKY has 99952 (46.8%) missing values Missing
SPEED has 133030 (62.2%) missing values Missing
TIME has 98272 (46.0%) missing values Missing
TIME_OF_DAY has 78080 (36.5%) missing values Missing
AC_CLASS has 65249 (30.5%) missing values Missing
AC_MASS has 65597 (30.7%) missing values Missing
LOCATION is uniformly distributed Uniform
df_index has unique values Unique
DISTANCE has 137490 (64.3%) zeros Zeros
HEIGHT has 54541 (25.5%) zeros Zeros

Reproduction

Analysis started: 2022-01-18 21:19:21.349045
Analysis finished: 2022-01-18 21:20:16.260431
Duration: 54.91 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct213775
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean121333.1461
Minimum0
Maximum255908
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1.6 MiB

Quantile statistics

Minimum0
5-th percentile11296.7
Q161636.5
median120496
Q3181928.5
95-th percentile231086.3
Maximum255908
Range255908
Interquartile range (IQR)120292
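
Range and interquartile range follow directly from the quantiles listed above. A short check using the df_index values from this report (variable names are illustrative):

```python
# Quantiles of df_index as listed in the report.
minimum, q1, q3, maximum = 0, 61636.5, 181928.5, 255908

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range, iqr)
```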

Descriptive statistics

Standard deviation70272.93468
Coefficient of variation (CV)0.5791734322
Kurtosis-1.18210924
Mean121333.1461
Median Absolute Deviation (MAD)60173
Skewness0.006462416908
Sum2.593799331 × 10^10
Variance4938285348
MonotonicityStrictly increasing
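
Two of the descriptive statistics above are simple functions of the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A small sketch that recomputes both from the df_index figures in this report (names are illustrative):

```python
# Values for df_index as printed in the report.
mean = 121333.1461
std_dev = 70272.93468

coefficient_of_variation = std_dev / mean  # unitless relative spread
variance = std_dev ** 2                    # squared standard deviation

print(f"CV = {coefficient_of_variation:.6f}")
print(f"Variance = {variance:.0f}")
```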
Histogram with fixed size bins (bins=50)
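
"Fixed size bins" means the value range is cut into equally wide intervals. The following is a rough standard-library illustration of that idea, not the plotting code behind the report (which renders its charts with Matplotlib); the function name and the sample data are made up for the example:

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum value is placed in the last bin, not a new one.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


sample = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
counts = fixed_width_histogram(sample, bins=5)
print(counts)
```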
ValueCountFrequency (%)
01
 
< 0.1%
1612631
 
< 0.1%
1611681
 
< 0.1%
1611701
 
< 0.1%
1611711
 
< 0.1%
1611721
 
< 0.1%
1611731
 
< 0.1%
1611741
 
< 0.1%
1611761
 
< 0.1%
1611771
 
< 0.1%
Other values (213765)213765
> 99.9%
ValueCountFrequency (%)
01
< 0.1%
11
< 0.1%
21
< 0.1%
31
< 0.1%
41
< 0.1%
51
< 0.1%
71
< 0.1%
81
< 0.1%
91
< 0.1%
101
< 0.1%
ValueCountFrequency (%)
2559081
< 0.1%
2559071
< 0.1%
2559061
< 0.1%
2558401
< 0.1%
2557581
< 0.1%
2557561
< 0.1%
2557551
< 0.1%
2557541
< 0.1%
2554211
< 0.1%
2554171
< 0.1%

AIRPORT_ID
Categorical

HIGH CARDINALITY

Distinct2465
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Memory size12.4 MiB
KDEN
 
7786
KDFW
 
6752
KORD
 
5371
KJFK
 
4418
KMEM
 
4152
Other values (2460)
185296 

Length

Max length5
Median length4
Mean length3.996608584
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique837 ?
Unique (%)0.4%

Sample

1st rowKMIE
2nd rowKMSY
3rd rowKORD
4th rowKMHT
5th rowKELP

Common Values

ValueCountFrequency (%)
KDEN7786
 
3.6%
KDFW6752
 
3.2%
KORD5371
 
2.5%
KJFK4418
 
2.1%
KMEM4152
 
1.9%
KSMF3272
 
1.5%
KSLC3223
 
1.5%
KMCO3008
 
1.4%
KDTW2962
 
1.4%
KLGA2847
 
1.3%
Other values (2455)169984
79.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
kden7877
 
3.7%
kdfw6752
 
3.2%
kord5371
 
2.5%
kjfk4418
 
2.1%
kmem4152
 
1.9%
ksmf3272
 
1.5%
kslc3223
 
1.5%
kmco3008
 
1.4%
kdtw2962
 
1.4%
klga2847
 
1.3%
Other values (2454)169893
79.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

DISTANCE
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct181
Distinct (%)0.1%
Missing59548
Missing (%)27.9%
Infinite0
Infinite (%)0.0%
Mean0.7795403204
Minimum0
Maximum99
Zeros137490
Zeros (%)64.3%
Negative0
Negative (%)0.0%
Memory size1.6 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile5
Maximum99
Range99
Interquartile range (IQR)0

Descriptive statistics

Standard deviation3.532196689
Coefficient of variation (CV)4.531127636
Kurtosis90.67904576
Mean0.7795403204
Median Absolute Deviation (MAD)0
Skewness7.804500252
Sum120226.165
Variance12.47641345
MonotonicityNot monotonic
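
The DISTANCE figures mix two denominators, which is easy to miss. The zero and missing percentages are consistent with a denominator of all 213775 rows, while the mean is consistent with the sum divided by the non-missing rows only. A quick check of that reading, treated here as an assumption about how the report derives these numbers:

```python
n_rows = 213_775
n_missing = 59_548
n_zeros = 137_490
total_sum = 120_226.165

n_present = n_rows - n_missing          # rows that have a DISTANCE value

zeros_pct = 100 * n_zeros / n_rows      # relative to all rows
missing_pct = 100 * n_missing / n_rows  # relative to all rows
mean = total_sum / n_present            # relative to non-missing rows only

print(f"zeros {zeros_pct:.1f}%, missing {missing_pct:.1f}%, mean {mean:.4f}")
```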
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0137490
64.3%
12295
 
1.1%
51931
 
0.9%
31613
 
0.8%
21584
 
0.7%
101425
 
0.7%
4874
 
0.4%
15803
 
0.4%
0.5710
 
0.3%
20673
 
0.3%
Other values (171)4829
 
2.3%
(Missing)59548
27.9%
ValueCountFrequency (%)
0137490
64.3%
0.012
 
< 0.1%
0.052
 
< 0.1%
0.173
 
< 0.1%
0.1254
 
< 0.1%
0.1311
 
< 0.1%
0.151
 
< 0.1%
0.161
 
< 0.1%
0.2133
 
0.1%
0.25148
 
0.1%
ValueCountFrequency (%)
993
 
< 0.1%
951
 
< 0.1%
902
 
< 0.1%
803
 
< 0.1%
756
 
< 0.1%
652
 
< 0.1%
621
 
< 0.1%
6015
< 0.1%
553
 
< 0.1%
521
 
< 0.1%

ENROUTE_STATE
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct5
Distinct (%)83.3%
Missing213769
Missing (%)> 99.9%
Memory size6.5 MiB
RI
NY
GA
NJ
AB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)66.7%
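
For this column the Distinct (%) and Unique (%) figures only make sense relative to the 6 non-missing entries, not to all 213775 rows. The sketch below rebuilds both percentages from the value counts listed under Common Values; reading "unique" as "occurs exactly once" is an assumption that matches the numbers shown:

```python
from collections import Counter

# The six non-missing ENROUTE_STATE entries, per the Common Values table.
values = ["RI", "RI", "NY", "GA", "NJ", "AB"]
counts = Counter(values)

n_present = len(values)                                  # 213775 - 213769 = 6
n_distinct = len(counts)                                 # different values seen
n_unique = sum(1 for c in counts.values() if c == 1)     # values seen exactly once

distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present

print(f"distinct {distinct_pct:.1f}%, unique {unique_pct:.1f}%")
```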

Sample

1st rowNY
2nd rowGA
3rd rowNJ
4th rowRI
5th rowRI

Common Values

ValueCountFrequency (%)
RI2
 
< 0.1%
NY1
 
< 0.1%
GA1
 
< 0.1%
NJ1
 
< 0.1%
AB1
 
< 0.1%
(Missing)213769
> 99.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
ri2
33.3%
ny1
16.7%
ga1
16.7%
nj1
16.7%
ab1
16.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

FAAREGION
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size12.2 MiB
ASO
43702 
AEA
35181 
AGL
33589 
ASW
28227 
AWP
27224 
Other values (5)
45852 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAGL
2nd rowASW
3rd rowAGL
4th rowANE
5th rowASW

Common Values

ValueCountFrequency (%)
ASO43702
20.4%
AEA35181
16.5%
AGL33589
15.7%
ASW28227
13.2%
AWP27224
12.7%
ANM21979
10.3%
ACE10216
 
4.8%
ANE7971
 
3.7%
FGN4409
 
2.1%
AAL1277
 
0.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
aso43702
20.4%
aea35181
16.5%
agl33589
15.7%
asw28227
13.2%
awp27224
12.7%
anm21979
10.3%
ace10216
 
4.8%
ane7971
 
3.7%
fgn4409
 
2.1%
aal1277
 
0.6%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

HEIGHT
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct636
Distinct (%)0.5%
Missing87483
Missing (%)40.9%
Infinite0
Infinite (%)0.0%
Mean819.2182165
Minimum0
Maximum25000
Zeros54541
Zeros (%)25.5%
Negative0
Negative (%)0.0%
Memory size1.6 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median30
Q3700
95-th percentile4500
Maximum25000
Range25000
Interquartile range (IQR)700

Descriptive statistics

Standard deviation1791.618142
Coefficient of variation (CV)2.186985232
Kurtosis17.16208504
Mean819.2182165
Median Absolute Deviation (MAD)30
Skewness3.59538724
Sum103460707
Variance3209895.565
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
054541
25.5%
1005781
 
2.7%
505196
 
2.4%
2004414
 
2.1%
5004266
 
2.0%
10003819
 
1.8%
103379
 
1.6%
3003339
 
1.6%
30002789
 
1.3%
20002722
 
1.3%
Other values (626)36046
16.9%
(Missing)87483
40.9%
ValueCountFrequency (%)
054541
25.5%
1134
 
0.1%
2206
 
0.1%
3175
 
0.1%
480
 
< 0.1%
51103
 
0.5%
641
 
< 0.1%
744
 
< 0.1%
868
 
< 0.1%
920
 
< 0.1%
ValueCountFrequency (%)
250003
< 0.1%
243001
 
< 0.1%
240001
 
< 0.1%
230001
 
< 0.1%
220001
 
< 0.1%
213001
 
< 0.1%
210004
< 0.1%
200007
< 0.1%
190001
 
< 0.1%
185002
 
< 0.1%

INCIDENT_MONTH
Real number (ℝ≥0)

HIGH CORRELATION

Distinct12
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.216727868
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.6 MiB

Quantile statistics

Minimum1
5-th percentile2
Q15
median8
Q39
95-th percentile11
Maximum12
Range11
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.781289303
Coefficient of variation (CV)0.3853947875
Kurtosis-0.575629048
Mean7.216727868
Median Absolute Deviation (MAD)2
Skewness-0.3823638586
Sum1542756
Variance7.735570187
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
831016
14.5%
728562
13.4%
928069
13.1%
1025532
11.9%
519642
9.2%
618142
8.5%
1114753
6.9%
414499
6.8%
310314
 
4.8%
128936
 
4.2%
Other values (2)14310
6.7%
ValueCountFrequency (%)
17382
 
3.5%
26928
 
3.2%
310314
 
4.8%
414499
6.8%
519642
9.2%
618142
8.5%
728562
13.4%
831016
14.5%
928069
13.1%
1025532
11.9%
ValueCountFrequency (%)
128936
 
4.2%
1114753
6.9%
1025532
11.9%
928069
13.1%
831016
14.5%
728562
13.4%
618142
8.5%
519642
9.2%
414499
6.8%
310314
 
4.8%

INCIDENT_YEAR
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct31
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2009.814132
Minimum1990
Maximum2020
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.6 MiB

Quantile statistics

Minimum1990
5-th percentile1994
Q12005
median2012
Q32016
95-th percentile2019
Maximum2020
Range30
Interquartile range (IQR)11

Descriptive statistics

Standard deviation7.827250144
Coefficient of variation (CV)0.003894514433
Kurtosis-0.4634070087
Mean2009.814132
Median Absolute Deviation (MAD)5
Skewness-0.6916566006
Sum429648016
Variance61.26584481
MonotonicityNot monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
201914643
 
6.8%
201813966
 
6.5%
201712767
 
6.0%
201512045
 
5.6%
201411950
 
5.6%
201611758
 
5.5%
202010263
 
4.8%
20139950
 
4.7%
20129755
 
4.6%
20119213
 
4.3%
Other values (21)97465
45.6%
ValueCountFrequency (%)
19901702
0.8%
19912224
1.0%
19922437
1.1%
19932471
1.2%
19942501
1.2%
19952588
1.2%
19962742
1.3%
19973175
1.5%
19983551
1.7%
19994157
1.9%
ValueCountFrequency (%)
202010263
4.8%
201914643
6.8%
201813966
6.5%
201712767
6.0%
201611758
5.5%
201512045
5.6%
201411950
5.6%
20139950
4.7%
20129755
4.6%
20119213
4.3%

LATITUDE
Real number (ℝ)

HIGH CORRELATION

Distinct2458
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean36.90004351
Minimum-37.673333
Maximum71.28545
Zeros0
Zeros (%)0.0%
Negative267
Negative (%)0.1%
Memory size1.6 MiB

Quantile statistics

Minimum-37.673333
5-th percentile26.07258
Q133.30783
median38.81809
Q340.8501
95-th percentile45.44906
Maximum71.28545
Range108.958783
Interquartile range (IQR)7.54227

Descriptive statistics

Standard deviation6.877066742
Coefficient of variation (CV)0.1863701527
Kurtosis10.71662481
Mean36.90004351
Median Absolute Deviation (MAD)3.425
Skewness-1.535535012
Sum7888306.802
Variance47.29404697
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
39.858417786
 
3.6%
32.895956752
 
3.2%
41.97965371
 
2.5%
40.639754418
 
2.1%
35.042424152
 
1.9%
38.695423272
 
1.5%
40.788393223
 
1.5%
28.428893008
 
1.4%
42.212062962
 
1.4%
40.777242847
 
1.3%
Other values (2448)169984
79.5%
ValueCountFrequency (%)
-37.6733331
 
< 0.1%
-37.0080565
 
< 0.1%
-34.83841712
 
< 0.1%
-34.82222269
< 0.1%
-34.5591752
 
< 0.1%
-33.9648061
 
< 0.1%
-33.9461119
 
< 0.1%
-33.39297524
 
< 0.1%
-29.993833332
 
< 0.1%
-27.38331
 
< 0.1%
ValueCountFrequency (%)
71.2854515
< 0.1%
70.3256
 
< 0.1%
70.209951
 
< 0.1%
70.1947610
 
< 0.1%
69.37111271
 
< 0.1%
66.8846834
< 0.1%
66.828531
 
< 0.1%
66.600131
 
< 0.1%
65.75861111
 
< 0.1%
65.697561
 
< 0.1%

LOCATION
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct512
Distinct (%)87.5%
Missing213190
Missing (%)99.7%
Memory size6.5 MiB
A
 
12
.
 
4
10 MI E
 
4
5 MI FINAL
 
4
5 MILES OUT
 
4
Other values (507)
557 

Length

Max length60
Median length11
Mean length11.01538462
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique474 ?
Unique (%)81.0%

Sample

1st row20 NE BNA
2nd row30 NM EAST OAK
3rd row50 MI E MCI
4th row25 MI NW SUX
5th row30 NW MIA

Common Values

ValueCountFrequency (%)
A12
 
< 0.1%
.4
 
< 0.1%
10 MI E4
 
< 0.1%
5 MI FINAL4
 
< 0.1%
5 MILES OUT4
 
< 0.1%
3 MI FINAL4
 
< 0.1%
15 NW4
 
< 0.1%
5 MI W4
 
< 0.1%
SBN-MEM4
 
< 0.1%
10 MI FINAL4
 
< 0.1%
Other values (502)537
 
0.3%
(Missing)213190
99.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
mi249
 
12.4%
nm80
 
4.0%
of67
 
3.4%
1066
 
3.3%
559
 
2.9%
n51
 
2.5%
e49
 
2.5%
w48
 
2.4%
miles46
 
2.3%
1546
 
2.3%
Other values (373)1239
62.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

LONGITUDE
Real number (ℝ)

HIGH CORRELATION

Distinct2460
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-91.66612381
Minimum-177.381
Maximum178.559228
Zeros0
Zeros (%)0.0%
Negative211819
Negative (%)99.1%
Memory size1.6 MiB

Quantile statistics

Minimum-177.381
5-th percentile-122.37484
Q1-100.49631
median-87.90446
Q3-80.29056
95-th percentile-72.8871
Maximum178.559228
Range355.940228
Interquartile range (IQR)20.20575

Descriptive statistics

Standard deviation25.16118536
Coefficient of variation (CV)-0.274487284
Kurtosis28.86807481
Mean-91.66612381
Median Absolute Deviation (MAD)9.13274
Skewness3.039541983
Sum-19595925.62
Variance633.0852489
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-104.6677786
 
3.6%
-97.03726752
 
3.2%
-87.904465371
 
2.5%
-73.778934418
 
2.1%
-89.976674152
 
1.9%
-121.590773272
 
1.5%
-111.977773223
 
1.5%
-81.316033008
 
1.4%
-83.348842962
 
1.4%
-73.872612847
 
1.3%
Other values (2450)169984
79.5%
ValueCountFrequency (%)
-177.38174
< 0.1%
-176.64603062
 
< 0.1%
-171.73277781
 
< 0.1%
-170.7105345
< 0.1%
-170.22255561
 
< 0.1%
-169.66373612
 
< 0.1%
-169.53455
 
< 0.1%
-168.95305561
 
< 0.1%
-166.543526
 
< 0.1%
-165.60410831
 
< 0.1%
ValueCountFrequency (%)
178.5592281
 
< 0.1%
177.4433781
 
< 0.1%
174.7916675
 
< 0.1%
174.113332
 
< 0.1%
171.272039
< 0.1%
167.43332
 
< 0.1%
166.636661
 
< 0.1%
158.20898921
< 0.1%
153.118331
 
< 0.1%
151.1772229
< 0.1%

OPID
Categorical

HIGH CARDINALITY

Distinct585
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size12.2 MiB
UNK
64856 
BUS
17670 
SWA
17345 
AAL
14069 
UAL
10532 
Other values (580)
89303 

Length

Max length5
Median length3
Mean length3.023688457
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique98 ?
Unique (%)< 0.1%

Sample

1st rowPVT
2nd rowTWA
3rd rowUAL
4th rowPVT
5th rowAAL

Common Values

ValueCountFrequency (%)
UNK64856
30.3%
BUS17670
 
8.3%
SWA17345
 
8.1%
AAL14069
 
6.6%
UAL10532
 
4.9%
DAL8865
 
4.1%
FDX8011
 
3.7%
SKW4731
 
2.2%
UPS4527
 
2.1%
USA3775
 
1.8%
Other values (575)59394
27.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
unk64856
30.3%
bus17670
 
8.3%
swa17345
 
8.1%
aal14069
 
6.6%
ual10532
 
4.9%
dal8865
 
4.1%
fdx8011
 
3.7%
skw4731
 
2.2%
ups4527
 
2.1%
usa3775
 
1.8%
Other values (572)59394
27.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

PHASE_OF_FLIGHT
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct11
Distinct (%)< 0.1%
Missing65371
Missing (%)30.6%
Memory size11.3 MiB
Approach
64585 
Landing Roll
27459 
Take-off Run
26340 
Climb
24041 
Descent
 
1974
Other values (6)
 
4005

Length

Max length12
Median length8
Mean length8.927609768
Min length4

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowApproach
2nd rowLanding Roll
3rd rowLanding Roll
4th rowApproach
5th rowApproach

Common Values

ValueCountFrequency (%)
Approach64585
30.2%
Landing Roll27459
12.8%
Take-off Run26340
12.3%
Climb24041
 
11.2%
Descent1974
 
0.9%
Departure1957
 
0.9%
Local802
 
0.4%
Arrival601
 
0.3%
Taxi554
 
0.3%
Parked81
 
< 0.1%
(Missing)65371
30.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
approach64585
31.9%
landing27459
13.6%
roll27459
13.6%
take-off26340
13.0%
run26340
13.0%
climb24041
 
11.9%
descent1974
 
1.0%
departure1957
 
1.0%
local802
 
0.4%
arrival601
 
0.3%
Other values (3)645
 
0.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

PRECIPITATION
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct11
Distinct (%)< 0.1%
Missing103893
Missing (%)48.6%
Memory size9.6 MiB
None
100181 
Rain
 
6700
Fog
 
2279
Snow
 
396
Fog, Rain
 
269
Other values (6)
 
57

Length

Max length15
Median length4
Mean length3.994667006
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNone
2nd rowNone
3rd rowNone
4th rowNone
5th rowNone

Common Values

ValueCountFrequency (%)
None100181
46.9%
Rain6700
 
3.1%
Fog2279
 
1.1%
Snow396
 
0.2%
Fog, Rain269
 
0.1%
Rain, Snow20
 
< 0.1%
Fog, Snow14
 
< 0.1%
None, Snow10
 
< 0.1%
Fog, Rain, Snow5
 
< 0.1%
Fog, None5
 
< 0.1%
(Missing)103893
48.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none100199
90.9%
rain6997
 
6.3%
fog2572
 
2.3%
snow445
 
0.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

RUNWAY
Categorical

HIGH CARDINALITY
MISSING

Distinct808
Distinct (%)0.4%
Missing30089
Missing (%)14.1%
Memory size11.4 MiB
17R
 
3515
27
 
3422
22L
 
3215
23
 
3178
16R
 
3166
Other values (803)
167190 

Length

Max length31
Median length3
Mean length2.626248054
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique396 ?
Unique (%)0.2%

Sample

1st row1
2nd row22R
3rd row35
4th row26L
5th row24R

Common Values

ValueCountFrequency (%)
17R3515
 
1.6%
273422
 
1.6%
22L3215
 
1.5%
233178
 
1.5%
16R3166
 
1.5%
223165
 
1.5%
17L3117
 
1.5%
313088
 
1.4%
4R3051
 
1.4%
242854
 
1.3%
Other values (798)151915
71.1%
(Missing)30089
 
14.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
17r3529
 
1.9%
273422
 
1.9%
22l3227
 
1.7%
233178
 
1.7%
16r3176
 
1.7%
223166
 
1.7%
17l3124
 
1.7%
313090
 
1.7%
4r3057
 
1.7%
242856
 
1.5%
Other values (608)152773
82.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

SKY
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct3
Distinct (%)< 0.1%
Missing99952
Missing (%)46.8%
Memory size10.2 MiB
No Cloud
54187 
Some Cloud
39354 
Overcast
20282 

Length

Max length10
Median length8
Mean length8.691494689
Min length8

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNo Cloud
2nd rowNo Cloud
3rd rowNo Cloud
4th rowNo Cloud
5th rowNo Cloud

Common Values

ValueCountFrequency (%)
No Cloud54187
25.3%
Some Cloud39354
 
18.4%
Overcast20282
 
9.5%
(Missing)99952
46.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
cloud93541
45.1%
no54187
26.1%
some39354
19.0%
overcast20282
 
9.8%
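
The table above counts lowercased, whitespace-separated tokens rather than whole category values, which is why "cloud" (93541) equals the sum of "No Cloud" (54187) and "Some Cloud" (39354). A minimal sketch of that tokenisation, assuming a plain lowercase-and-split rule; the report's actual tokeniser may differ in details:

```python
from collections import Counter

# Category counts for SKY from the Common Values table.
category_counts = {"No Cloud": 54187, "Some Cloud": 39354, "Overcast": 20282}

token_counts = Counter()
for value, count in category_counts.items():
    # Each occurrence of a category contributes once to every token it contains.
    for token in value.lower().split():
        token_counts[token] += count

total_tokens = sum(token_counts.values())
for token, count in token_counts.most_common():
    print(f"{token}: {count} ({100 * count / total_tokens:.1f}%)")
```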

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

SPEED
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct277
Distinct (%)0.3%
Missing133030
Missing (%)62.2%
Infinite0
Infinite (%)0.0%
Mean143.0360518
Minimum0
Maximum400
Zeros15
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1.6 MiB

Quantile statistics

Minimum0
5-th percentile75
Q1120
median140
Q3160
95-th percentile250
Maximum400
Range400
Interquartile range (IQR)40

Descriptive statistics

Standard deviation46.09679136
Coefficient of variation (CV)0.3222739358
Kurtosis0.8132594206
Mean143.0360518
Median Absolute Deviation (MAD)20
Skewness0.6118044344
Sum11549446
Variance2124.914174
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1409082
 
4.2%
1306432
 
3.0%
1205717
 
2.7%
1505483
 
2.6%
1005009
 
2.3%
1354115
 
1.9%
2503601
 
1.7%
1603421
 
1.6%
1802998
 
1.4%
1102636
 
1.2%
Other values (267)32251
 
15.1%
(Missing)133030
62.2%
ValueCountFrequency (%)
015
 
< 0.1%
11
 
< 0.1%
24
 
< 0.1%
35
 
< 0.1%
44
 
< 0.1%
538
< 0.1%
61
 
< 0.1%
72
 
< 0.1%
83
 
< 0.1%
91
 
< 0.1%
ValueCountFrequency (%)
4002
 
< 0.1%
3801
 
< 0.1%
3741
 
< 0.1%
3651
 
< 0.1%
3631
 
< 0.1%
3552
 
< 0.1%
3541
 
< 0.1%
3508
< 0.1%
3405
< 0.1%
3351
 
< 0.1%

STATE
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct63
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size12.0 MiB
TX
20780 
CA
17846 
FL
16600 
NY
 
11751
IL
 
10231
Other values (58)
136567 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowIN
2nd rowLA
3rd rowIL
4th rowNH
5th rowTX

Common Values

ValueCountFrequency (%)
TX20780
 
9.7%
CA17846
 
8.3%
FL16600
 
7.8%
NY11751
 
5.5%
IL10231
 
4.8%
CO9952
 
4.7%
PA7280
 
3.4%
TN7062
 
3.3%
OH6983
 
3.3%
NJ6493
 
3.0%
Other values (53)98797
46.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
tx20780
 
9.7%
ca17846
 
8.3%
fl16600
 
7.8%
ny11751
 
5.5%
il10231
 
4.8%
co9952
 
4.7%
pa7280
 
3.4%
tn7062
 
3.3%
oh6983
 
3.3%
nj6493
 
3.0%
Other values (53)98797
46.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

TIME
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1468
Distinct (%): 1.3%
Missing: 98272
Missing (%): 46.0%
Memory size: 9.8 MiB

Value | Count
09:00 | 1084
08:00 | 1042
10:00 | 1005
11:00 | 847
08:30 | 836
Other values (1463) | 110689

Length

Max length: 6
Median length: 5
Mean length: 4.999930738
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): < 0.1%

Sample

1st row: 19:05
2nd row: 08:08
3rd row: 08:10
4th row: 18:00
5th row: 20:25

Common Values

Value | Count | Frequency (%)
09:00 | 1084 | 0.5%
08:00 | 1042 | 0.5%
10:00 | 1005 | 0.5%
11:00 | 847 | 0.4%
08:30 | 836 | 0.4%
22:00 | 830 | 0.4%
09:30 | 821 | 0.4%
21:00 | 811 | 0.4%
07:00 | 807 | 0.4%
23:00 | 784 | 0.4%
Other values (1458) | 106636 | 49.9%
(Missing) | 98272 | 46.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
09:00 | 1084 | 0.9%
08:00 | 1042 | 0.9%
10:00 | 1005 | 0.9%
11:00 | 847 | 0.7%
08:30 | 836 | 0.7%
22:00 | 830 | 0.7%
09:30 | 821 | 0.7%
21:00 | 811 | 0.7%
07:00 | 807 | 0.7%
23:00 | 784 | 0.7%
Other values (1453) | 106636 | 92.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

TIME_OF_DAY
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): < 0.1%
Missing: 78080
Missing (%): 36.5%
Memory size: 10.2 MiB

Value | Count
Day | 84523
Night | 39874
Dusk | 6250
Dawn | 5048

Length

Max length: 5
Median length: 3
Mean length: 3.67096061
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Night
2nd row: Day
3rd row: Day
4th row: Night
5th row: Night

Common Values

Value | Count | Frequency (%)
Day | 84523 | 39.5%
Night | 39874 | 18.7%
Dusk | 6250 | 2.9%
Dawn | 5048 | 2.4%
(Missing) | 78080 | 36.5%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
day | 84523 | 62.3%
night | 39874 | 29.4%
dusk | 6250 | 4.6%
dawn | 5048 | 3.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

WARNED
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.6 MiB

Value | Count
Unknown | 119844
No | 52850
Yes | 41081

Length

Max length: 7
Median length: 7
Mean length: 4.995209917
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Unknown
3rd row: Unknown
4th row: No
5th row: No

Common Values

Value | Count | Frequency (%)
Unknown | 119844 | 56.1%
No | 52850 | 24.7%
Yes | 41081 | 19.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
unknown | 119844 | 56.1%
no | 52850 | 24.7%
yes | 41081 | 19.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

AC_CLASS
Categorical

HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 65249
Missing (%): 30.5%
Memory size: 10.2 MiB

Value | Count
A | 147725
B | 792
J | 4
C | 4
Y | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
A | 147725 | 69.1%
B | 792 | 0.4%
J | 4 | < 0.1%
C | 4 | < 0.1%
Y | 1 | < 0.1%
(Missing) | 65249 | 30.5%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
a | 147725 | 99.5%
b | 792 | 0.5%
j | 4 | < 0.1%
c | 4 | < 0.1%
y | 1 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

AC_MASS
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 65597
Missing (%): 30.7%
Memory size: 11.0 MiB

Value | Count
4.0 | 100564
3.0 | 27924
1.0 | 9439
2.0 | 8604
5.0 | 1647

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 4.0
3rd row: 4.0
4th row: 1.0
5th row: 4.0

Common Values

Value | Count | Frequency (%)
4.0 | 100564 | 47.0%
3.0 | 27924 | 13.1%
1.0 | 9439 | 4.4%
2.0 | 8604 | 4.0%
5.0 | 1647 | 0.8%
(Missing) | 65597 | 30.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
4.0 | 100564 | 67.9%
3.0 | 27924 | 18.8%
1.0 | 9439 | 6.4%
2.0 | 8604 | 5.8%
5.0 | 1647 | 1.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

INDICATED_DAMAGE
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 208.9 KiB

Value | Count
False | 199475
True | 14300

Value | Count | Frequency (%)
False | 199475 | 93.3%
True | 14300 | 6.7%

c_score
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 47
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.0783049
Minimum: 0.03
Maximum: 9.17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0.03
5-th percentile: 0.72
Q1: 2.58
Median: 3.29
Q3: 4.54
95-th percentile: 9.17
Maximum: 9.17
Range: 9.14
Interquartile range (IQR): 1.96

Descriptive statistics

Standard deviation: 2.422975187
Coefficient of variation (CV): 0.5941133012
Kurtosis: -0.3346380145
Mean: 4.0783049
Median Absolute Deviation (MAD): 1.12
Skewness: 0.7533522629
Sum: 871839.63
Variance: 5.870808759
Monotonicity: Not monotonic
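
Several of the c_score statistics above are derived from one another. The short Python check below recomputes them from the reported values, as a sanity check rather than a description of how the report was generated. The variable names are illustrative; the numbers are copied from this report, and n is the number of observations given in the dataset overview.

```python
# Values copied from the c_score summary in this report.
n = 213775                 # number of observations (dataset overview)
mean_value = 4.0783049
std_dev = 2.422975187
q1, q3 = 2.58, 4.54
minimum, maximum = 0.03, 9.17

# Derived statistics, using their standard definitions.
cv = std_dev / mean_value          # coefficient of variation
variance = std_dev ** 2            # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
total = mean_value * n             # sum of all values

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.6f}")
print(f"IQR      = {iqr:.2f}")
print(f"Range    = {value_range:.2f}")
print(f"Sum      = {total:.2f}")
```

Small differences in the last digits are expected because the inputs are already rounded.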
Histogram with fixed size bins (bins=47)
Value | Count | Frequency (%)
9.17 | 19177 | 9.0%
7.28 | 12796 | 6.0%
4.32 | 12208 | 5.7%
4.54 | 11372 | 5.3%
7.58 | 10491 | 4.9%
2.58 | 10415 | 4.9%
3.99 | 10079 | 4.7%
3.51 | 9776 | 4.6%
6.22 | 9682 | 4.5%
2.95 | 9586 | 4.5%
Other values (37) | 98193 | 45.9%

Value | Count | Frequency (%)
0.03 | 234 | 0.1%
0.04 | 286 | 0.1%
0.05 | 188 | 0.1%
0.08 | 106 | < 0.1%
0.09 | 110 | 0.1%
0.11 | 355 | 0.2%
0.12 | 124 | 0.1%
0.13 | 185 | 0.1%
0.17 | 153 | 0.1%
0.18 | 507 | 0.2%

Value | Count | Frequency (%)
9.17 | 19177 | 9.0%
7.58 | 10491 | 4.9%
7.28 | 12796 | 6.0%
6.22 | 9682 | 4.5%
4.54 | 11372 | 5.3%
4.32 | 12208 | 5.7%
3.99 | 10079 | 4.7%
3.97 | 8535 | 4.0%
3.51 | 9776 | 4.6%
3.29 | 6162 | 2.9%

Interactions

[Figure: pairwise interaction plots, 81 panels]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
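
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used to produce the report: the helper names (`average_ranks`, `pearson`, `spearman`) and the sample values are made up for the example, and ties are handled by assigning average ranks, which is one common convention.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman(x, y)
r = pearson(x, y)
print(f"Spearman rho = {rho:.4f}, Pearson r = {r:.4f}")
```

On this monotonic but nonlinear example, ρ reaches its maximum while r stays below 1, which is the difference the description above refers to.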

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
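
As a rough illustration of this definition and of the invariance property mentioned above, the following plain-Python sketch computes r directly from the covariance and standard deviations. The function name `pearson` and the sample values are invented for the example and do not come from the dataset.

```python
from statistics import mean


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of the covariance and standard deviations cancel out.
    return cov / (sx * sy)


# Hypothetical, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_transformed = [10 * a + 3 for a in x]
y_transformed = [0.5 * b - 7 for b in y]
r_transformed = pearson(x_transformed, y_transformed)

print(f"r = {r:.6f}, after shifting and scaling = {r_transformed:.6f}")
```

Both values agree up to floating-point error, since changing the location or the (positive) scale of either variable leaves r unchanged.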

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
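
The pair-counting rule above corresponds to the simplest form of the coefficient, often called tau-a. The sketch below implements exactly that form in plain Python. It is an illustration under that assumption: libraries such as SciPy default to the tie-corrected tau-b variant, so results can differ when the data contain ties. The function name and the sample values are made up for the example.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


# Hypothetical data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(f"tau = {tau:.2f}")
```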

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
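
For reference, a bias-corrected Cramér's V along the lines of Bergsma's proposal can be sketched in plain Python as below. This is an illustrative reimplementation and not necessarily the exact code behind the report: the function name `cramers_v_corrected` and the toy category lists are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_count in row_totals.items():
        for b, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corrected - 1, r_corrected - 1)
    if denominator <= 0:
        return 0.0
    return sqrt(phi2_corrected / denominator)


# Hypothetical labels: perfectly associated vs. independent columns.
v_perfect = cramers_v_corrected(["Day", "Night"] * 50, ["No", "Yes"] * 50)
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(f"perfect association: {v_perfect:.3f}, independence: {v_independent:.3f}")
```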

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
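
To make the idea of nullity correlation concrete, the sketch below computes it for two columns in plain Python: each value is replaced by a 1/0 indicator of whether it is present, and the ordinary Pearson correlation of the two indicator columns is taken. The function name and the toy TIME and TIME_OF_DAY values are hypothetical, with `None` standing in for a missing cell; this is an illustration of the concept, not the report's own code.

```python
from statistics import mean


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if value is None else 1.0 for value in col_a]
    b = [0.0 if value is None else 1.0 for value in col_b]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    if sa == 0 or sb == 0:
        # A column that is always present or always missing has no variation,
        # so the correlation is undefined.
        return float("nan")
    return cov / (sa * sb)


# Hypothetical excerpt: TIME is missing more often than TIME_OF_DAY.
time = ["19:05", None, None, "18:00", None, "08:10"]
time_of_day = ["Night", None, "Day", "Night", None, "Day"]

corr = nullity_correlation(time, time_of_day)
print(f"nullity correlation = {corr:.4f}")
```

A value near +1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.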

Sample

First rows

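(row) | df_index | AIRPORT_ID | DISTANCE | ENROUTE_STATE | FAAREGION | HEIGHT | INCIDENT_MONTH | INCIDENT_YEAR | LATITUDE | LOCATION | LONGITUDE | OPID | PHASE_OF_FLIGHT | PRECIPITATION | RUNWAY | SKY | SPEED | STATE | TIME | TIME_OF_DAY | WARNED | AC_CLASS | AC_MASS | INDICATED_DAMAGE | c_score
0 | 0 | KMIE | NaN | NaN | AGL | 200.0 | 10 | 1990 | 40.24235 | NaN | -85.39586 | PVT | Approach | None | NaN | No Cloud | 70.0 | IN | NaN | Night | No | A | 1.0 | False | 2.95
1 | 1 | KMSY | 0.0 | NaN | ASW | 0.0 | 8 | 1993 | 29.99339 | NaN | -90.25803 | TWA | Landing Roll | NaN | 1 | NaN | NaN | LA | NaN | Day | Unknown | A | 4.0 | False | 2.08
2 | 2 | KORD | 0.0 | NaN | AGL | 0.0 | 8 | 1996 | 41.97960 | NaN | -87.90446 | UAL | Landing Roll | NaN | 22R | NaN | NaN | IL | NaN | NaN | Unknown | A | 4.0 | False | 4.54
3 | 3 | KMHT | 8.0 | NaN | ANE | 1800.0 | 9 | 1993 | 42.93452 | NaN | -71.43706 | PVT | Approach | None | 35 | No Cloud | 150.0 | NH | NaN | Day | No | A | 1.0 | True | 3.09
4 | 4 | KELP | NaN | NaN | ASW | 200.0 | 3 | 1991 | 31.80667 | NaN | -106.37781 | AAL | Approach | None | 26L | No Cloud | 135.0 | TX | NaN | Night | No | A | 4.0 | False | 0.72
5 | 5 | KPDX | NaN | NaN | ANM | 300.0 | 10 | 1990 | 45.58872 | NaN | -122.59750 | AAL | Approach | None | NaN | No Cloud | 140.0 | OR | NaN | Night | No | A | 4.0 | False | 3.29
6 | 7 | KLAX | 9.0 | NaN | AWP | 4000.0 | 4 | 1995 | 33.94254 | NaN | -118.40807 | UAL | Approach | NaN | 24R | NaN | NaN | CA | NaN | NaN | Unknown | A | 4.0 | False | 3.97
7 | 8 | KSAT | 0.0 | NaN | ASW | 0.0 | 8 | 1991 | 29.53369 | NaN | -98.46978 | SWA | Landing Roll | None | NaN | No Cloud | 120.0 | TX | NaN | Day | No | A | 4.0 | False | 3.51
8 | 9 | KSAT | NaN | NaN | ASW | 1500.0 | 4 | 1991 | 29.53369 | NaN | -98.46978 | AAL | Approach | None | NaN | Overcast | 170.0 | TX | NaN | Night | No | A | 4.0 | False | 3.51
9 | 10 | KBUR | NaN | NaN | AWP | 800.0 | 5 | 1990 | 34.20062 | NaN | -118.35850 | 1AWE | Approach | None | NaN | No Cloud | 120.0 | CA | NaN | Day | Yes | A | 4.0 | False | 3.97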

Last rows

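(row) | df_index | AIRPORT_ID | DISTANCE | ENROUTE_STATE | FAAREGION | HEIGHT | INCIDENT_MONTH | INCIDENT_YEAR | LATITUDE | LOCATION | LONGITUDE | OPID | PHASE_OF_FLIGHT | PRECIPITATION | RUNWAY | SKY | SPEED | STATE | TIME | TIME_OF_DAY | WARNED | AC_CLASS | AC_MASS | INDICATED_DAMAGE | c_score
213765 | 255417 | W91 | 0.0 | NaN | AEA | 0.0 | 12 | 2009 | 37.104500 | NaN | -79.588700 | PVT | Landing Roll | NaN | NaN | NaN | NaN | VA | NaN | NaN | Unknown | A | 1.0 | True | 3.25
213766 | 255421 | KCDW | 0.0 | NaN | AEA | 0.0 | 3 | 2012 | 40.875220 | NaN | -74.281360 | PVT | Landing Roll | NaN | 4 | NaN | NaN | NJ | 23:25 | Night | Unknown | A | 1.0 | True | 9.17
213767 | 255754 | KJAX | 0.0 | NaN | ASO | NaN | 2 | 2020 | 30.494060 | NaN | -81.687860 | RPA | Climb | NaN | 8 | NaN | NaN | FL | NaN | NaN | Unknown | A | 4.0 | False | 1.14
213768 | 255755 | 57C | 0.0 | NaN | AGL | 0.0 | 1 | 2015 | 42.797167 | NaN | -88.372611 | BUS | Landing Roll | NaN | 26 | NaN | NaN | WI | 19:07 | Night | Unknown | A | 1.0 | True | 4.54
213769 | 255756 | KFWS | 1.0 | NaN | ASW | NaN | 6 | 2020 | 32.565228 | NaN | -97.308078 | PVT | Approach | NaN | 17R | NaN | NaN | TX | 12:13 | Day | Unknown | A | 1.0 | True | 4.32
213770 | 255758 | 50D | 0.0 | NaN | AGL | 0.0 | 6 | 2017 | 46.009080 | NaN | -88.274026 | PVT | Landing Roll | NaN | 30 | NaN | NaN | MI | 10:50 | Day | Unknown | A | 1.0 | True | 1.01
213771 | 255840 | KFSD | 0.0 | NaN | AGL | 100.0 | 2 | 2020 | 43.581350 | NaN | -96.741700 | LOF | Climb | None | 21 | Overcast | NaN | SD | 18:35 | Dusk | Unknown | A | 3.0 | True | 1.34
213772 | 255906 | KLAX | NaN | NaN | AWP | 100.0 | 8 | 2009 | 33.942540 | NaN | -118.408070 | SHQ | Climb | None | 25R | No Cloud | 190.0 | CA | 14:48 | Day | Unknown | A | 4.0 | True | 3.97
213773 | 255907 | KPMD | NaN | NaN | AWP | NaN | 12 | 2020 | 34.629390 | NaN | -118.084550 | BOE | Departure | NaN | NaN | NaN | NaN | CA | 12:20 | Day | Unknown | A | 4.0 | True | 3.97
213774 | 255908 | KMLI | NaN | NaN | AGL | 300.0 | 1 | 2020 | 41.448530 | NaN | -90.507540 | ENY | Climb | None | 27 | Overcast | 135.0 | IL | 09:59 | Day | Unknown | A | 3.0 | True | 2.15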